Integrated Metabolomic and Transcriptomic Analysis Reveals Differential Mechanism of Flavonoid Biosynthesis in Two Cultivars of Angelica sinensis

Angelica sinensis is a traditional Chinese medicinal plant that has been primarily used as a blood tonic. It largely relies on its bioactive metabolites, which include ferulic acid, volatile oils, polysaccharides and flavonoids. In order to improve the yield and quality of A. sinensis, the two cultivars Mingui 1 (M1), with a purple stem, and Mingui 2 (M2), with a green stem, have been selected in the field. Although a higher root yield and ferulic acid content in M1 than M2 has been observed, the differences of flavonoid biosynthesis and stem-color formation are still limited. In this study, the contents of flavonoids and anthocyanins were determined by spectrophotometer, the differences of flavonoids and transcripts in M1 and M2 were conducted by metabolomic and transcriptomic analysis, and the expression level of candidate genes was validated by qRT-PCR. The results showed that the contents of flavonoids and anthocyanins were 1.5- and 2.6-fold greater in M1 than M2, respectively. A total of 26 differentially accumulated flavonoids (DAFs) with 19 up-regulated (UR) and seven down-regulated (DR) were obtained from the 131 identified flavonoids (e.g., flavonols, flavonoid, isoflavones, and anthocyanins) in M1 vs. M2. A total 2210 differentially expressed genes (DEGs) were obtained from the 34,528 full-length isoforms in M1 vs. M2, and 29 DEGs with 24 UR and 5 DR were identified to be involved in flavonoid biosynthesis, with 25 genes (e.g., CHS1, CHI3, F3H, DFR, ANS, CYPs and UGTs) mapped on the flavonoid biosynthetic pathway and four genes (e.g., RL1, RL6, MYB90 and MYB114) belonging to transcription factors. The differential accumulation level of flavonoids is coherent with the expression level of candidate genes. Finally, the network of DAFs regulated by DEGs was proposed. These findings will provide references for flavonoid production and cultivars selection of A. sinensis.


Introduction
Angelica sinensis (Oliv.) Diels (syn. Angelica polymorpha Maxim. var. sinensis Oliv.) is an Apiaceae (alt. Umbelliferae) perennial rhizomatous species and commonly named as Dang gui, Dong quai and Toki [1]. The species is originally native to China, with a population center in Gansu and widely cultivated at altitudes of 2000 to 3000 m [1][2][3]. The roots were first recorded in the earliest known herbal text "Shen Nong Ben Cao Jing", and have been used as a traditional Chinese medicine for nourishing the blood, regulating female menstrual disorders, relieving pain, and relaxing the bowels, etc., for over 2000 years [4,5]. Recently, the roots have been used as potential treatments of acute ischemic stroke, chronic obstructive pulmonary disease with pulmonary hypertension, as well as for its cardiocerebrovascular, immunomodulatory and antioxidant effects [5][6][7]. Phytochemical and pharmacological investigations have demonstrated that these therapeutic properties largely rely on bioactive metabolites including ferulic acid, volatile oils, polysaccharides and flavonoids [1,5,8].
For the difference of bioactive metabolites between M1 and M2, greater ferulic acid content and less ligustilide content has been observed in M1 in comparison with M2 [19,20]. The differences of other bioactive metabolites, including volatile oils, polysaccharides and flavonoids, as well as the stem-color formation between M1 and M2, have not been investigated. In this study, we examined the differences of flavonoids and transcripts based on metabolomic and transcriptomic analysis, and found that the flavonoid and anthocyanin contents were greater in M1 than M2; 26 flavonoids were differentially accumulated; and 29 genes involved in flavonoid biosynthesis were differentially expressed in M1 vs. M2.

Differenence of Flavonoid and Anthocyanin Contents between the Two Cultivars
As is shown in Figure 1, a significant difference of flavonoid and anthocyanin contents was observed, with a 1.48-and 2.57-fold greater amount in M1 than M2, respectively. (mean ± SD, n = 3). A t-test was performed for independent treatments, and the "*" is considered significant at p < 0.05 between M1 and M2.

Isoforms Analysis
A total of 702,133 high-fidelity reads were extracted after 38 full passes of raw reads ( Figure S4A), 45,026 polished high-quality isoforms were obtained using a Quiver calculation ( Figure S4B), and 34,528 full-length isoforms were generated after the full-length non-chimeric (FLNC) reads clustered and integrated ( Figure S4C).

Global Gene Analysis
To reveal molecular mechanisms responsible for the difference of flavonoid accumulation and the stem-color formation, comparison of gene transcription between M1 and M2 was performed. After data filtering, 38.65 and 38.73 million high-quality reads were collected, and 27.92 and 28.36 multiple mapped reads were obtained from the M1 and M2, respectively. Meanwhile, the exon rate reached 100% (Table 2).

Functional Annotation and Enrichment of DEGs
The function of the 2210 DEGs was annotated against the Gene Ontology (GO) and KO databases. For the GO database, 48 terms were classified into biological process (22), cellular component (16), and molecular function (10) ( Figure S9). For the KO database, 1784 DEGs were enriched 103 pathways, with top 10 pathways including: oxidative phosphorylation; metabolic pathways; linoleic acid metabolism; ABC transporters; alpha-Linolenic acid metabolism; nitrogen metabolism; phenylpropanoid biosynthesis; TCA cycle; cutin, suberine and wax biosynthesis; and pyruvate metabolism ( Figure 6).

Network of DAFs Regulated by DEGs
The 26 DAFs and 25 DEGs (exclude 4 TFs) were connected based on the flavonoid biosynthetic pathway analyzed by KO enrichment and biological function of proteins on the SwissProt database, and the proposed biosynthetic pathway is shown in Figure 7.
Flavonoids are synthesized via the phenylpropanoid pathway. Briefly, the upstream metabolite 4-coumaroyl-CoA is formed from phenylalanine by the catalyzation of PAL, C4H and 4CL. The 4-coumaroyl-CoA is converted into two metabolites, isoliquiritigenin and naringenin chalcone, by the catalyzation of CHS, then respectively transformed to liquiritigenin and naringenin by the catalyzation of CHI. In the sub-pathway of isoflavonoid biosynthesis, 13 cytochrome P450 monooxygenases (CYPs) are involved, and butin-7-O-glucoside (21) (Table 1), and the enzymes encoded by the 25 DEGs are colored in red.

qRT-PCR Validation of Candidate Genes Involved in Flavonoid Biosynthesis
As shown in Figure 7, 25 DEGs were mapped in the pathway of flavonoid biosynthesis, with 20 UR and 5 DR in M1 vs. M2 (Table 3). Among them, 22 genes (20 UR and 2 DR) were selected to qRT-PCR validation, and their relative expression levels (RELs) were consistent with RPKM values (Table 3, Figure 8).  Figure 8D).

Discussion
Accumulation of secondary metabolites is not only influenced by environmental factors (e.g., temperature, light, the supply of water and minerals) but also genotypes (e.g., variety, strain and cultivar) [21][22][23]. Previous studies have demonstrated that there is a significant difference of secondary metabolites among the three Angelica species: A. sinensis (Oliver) Diels, A. dahurica (Fisch. ex Hoffn) Benth, et Hook. F, and A. pubescens Maxim [5]. A higher root yield of M1 than M2 was observed [16][17][18]. In this study, the flavonoid and anthocyanin contents were greater in M1 than M2, 26 DAFs (19 UR and 7 DR) and 29 DEGs (24 UR and 5 DR) involved in flavonoid biosynthesis were observed in M1 vs. M2.
Flavonoids are widely distributed secondary metabolites with different metabolic functions in plants, such as providing colors attractive to plant pollinators, promoting physiological survival, and protecting plants from fungal pathogens and UV-B radiation; meanwhile, flavonoids possess antifungal, antioxidant and anticancer activities [24,25]. In this study, a 1.48 increase of flavonoid content was observed in M1 compared to M2 (Figure 1), suggesting that the adaptation ability of M1 to environmental conditions is stronger than that of M2, which is consistent with previous investigations that the yield and tolerance of M1 is greater and stronger than that of M2 in the field [16][17][18]. Additionally, a 2.57-fold greater anthocyanin content was observed in M1 compared to M2 (Figure 1), which can describe the difference of stem-color formation for M1 with purple stem and M2 with green stem. Several studies have reported that the contents of flavonoids and anthocyanins play a positive role in pigmentation [26,27].
Extensive experiments have demonstrated that the expression of genes encoding enzymes and TFs is responsible for the formation of flavonoid structures and their subsequent modification reactions [29]. In this study, 29 genes participating in flavonoid biosynthesis were screened from the 2210 DEGs in M1 vs. M2; the specific role of the 29 genes has been linked with the 26 DAFs and mapped on the phenylpropanoid pathway (Figures 5 and 7; Tables 3 and 5). Table 5. Primer sequence of candidate genes used for qRT-PCR validation.
CYPs play diverse roles in metabolism including the synthesis of secondary metabolites (e.g., flavonoids, alkaloids and lignan) [33,34]. Previous studies have found that the overexpression of CYPs genes promotes flavonoid and pigment biosynthesis [35,36]. In this study, 13 CYPs genes directly participate in isoflavonoid biosynthesis with 10 UR (Figure 7; Table 3), which will enhance the flavonoid biosynthesis and greater accumulation in M1 compared to M2 (Figure 1).
UDP-glycosyltransferases (UGTs) is one of the glycosyltransferases that comprise a highly divergent and polyphyletic multigene family involved in widespread glycosylation of plant secondary metabolites (e.g., anthocyanins) [37]. In this study, five UGTs genes were observed to participate in anthocyanin biosynthesis. Previous studies have found that CGT is involved in the biosynthesis of mangiferin [38], GT6 is involved in the formation of flavonol 3-O-glucosides [39], UGT85A8 is involved in glycosylate diterpenes or flavonols, UGT73C6 is involved in flavonol biosynthetic process while possessing low quercetin 3-Oglucosyltransferase, 7-O-glucosyltransferase and 4 -O-glucosyltransferase activities [40,41], and F3GT1 is involved in anthocyanin biosynthesis by catalyzing the galactosylation of cyanidin [42]. Meanwhile, two genes encoding malonyltransferase (MaT) that is also involved in anthocyanin biosynthesis include: 3MaT involved in the transfer of the malonyl group from malonyl-CoA to pelargonidin 3-O-glucoside to produce pelargonidin 3-O-6 -O-malonylglucoside [43], and P5MaT involved in the transfer of the malonyl group from malonyl-CoA to the 4 -hydroxyl group of the 5-glucosyl moiety of anthocyanins [44].
TFs play a great role in controlling cellular processes and MYB TF family is involved in controlling various processes such as responses to biotic and abiotic stresses, development, and metabolism, etc [45]. Several investigations have found that the overexpression of MYB TFs promote flavonoid and anthocyanin biosynthesis [46,47]. In this study, 4 MYB TFs were observed to be in involved in anthocyanin biosynthesis. The two TFs RL1 and RL6 as a member of the MYB-related gene family may regulate the anthocyanin biosynthesis [48]. The two TFs MYB90 and MYB114 are transcription activators, when associated with BHLHs/MYCs, EGL3, or GL3, they promote the synthesis of phenylpropanoid-derived compounds such as anthocyanins [49,50].

Plant Material
Functional leaves and petioles of two-year-old Angelica sinensis [two cultivars: Mingui 1 (M1) with purple stem and Mingui 2 (M2) with green stem, Figure S10] were collected from the city-owned breeding garden located in Shangconggou village, Huichuan Town, Weiyuan County, Dingxi City (2507 m a.s.l.; 35 • 2 39 N, 104 • 1 55 E) of Gansu province, China in July 2020. The two cultivars were identified by Professor Ling Jin (Gansu University of Chinese Medicine, Lanzhou, China). Voucher specimens (M1: 20190725GSWYMG1, M2: 20190725GSWYMG2) were deposited in the herbarium of College of Pharmacy, Gansu University of Chinese Medicine, Lanzhou, China. During the growth stages, the two cultivars were maintained with the same field management conditions. The collected samples (leaves and petioles = 1:1, g/g fresh weight; n = 20 plants) were immediately frozen in liquid nitrogen for total flavonoid and anthocyanin measurement, metabolomic and transcriptomic analysis.

Chemicals
Standards of metabolites used for UPLC analysis were purchased from BioBioPha (Kunming, Yunnan, China) and Sigma-Aldrich (St Louis, MO, CA, USA). All chemicals and reagents (e.g., AlCl 3 , catechin, ethanol, HCL, methanol, NaNO 2 and NaOH) were of analytical grade and purchased from Merck, Germany. Trizol reagent, RT Kit and SuperReal PreMix were purchased from Tiangen, China.

Measurement of Total Flavonoid Content
Fresh samples (0.5 g) were placed in ethanol (5 mL, 95% v/v) and ground, the homogenate was centrifuged at 5000 r/min for 10 min at 4 • C and re-extracted twice more. The extracts were increased to 20 mL with ethanol (95% v/v). Total flavonoids content was measured according to a NaNO 2 -AlCl 3 -NaOH method [51,52]. Briefly, extracts (150 µL) were added in ddH 2 O (2 mL) and NaNO 2 (5% w/v, 0.3 mL). After the mixture agitating for 5 min, AlCl 3 (10% w/v, 0.3 mL) was added and reacted for 1 min, then NaOH (1.0 mol/L, 2 mL) was added to stop the reaction. Absorbance readings were taken at 510 nm using a spectrometer. Total flavonoid content was calculated based on a standard curve and expressed as mg of catechin.

Measurement of Anthocyanin Content
Anthocyanins content was measured according to a previous protocol [53]. Fresh samples (0.5 g) were placed in methanol (5 mL, 0.1% HCL v/v) and ground, the homogenate was centrifuged at 5000 r/min for 30 s at 4 • C and re-extracted twice more. The extracts were increased to 20 mL with methanol (0.1% HCL v/v). Absorbance readings were taken at 530 nm using a spectrometer. Anthocyanins content was evaluated based on a relative expression level compared to the blank control.

Sample Preparation and Extraction
The freeze-dried samples were ground in a mixer mill with zirconia beads for 1.5 min at 30 Hz. The powder (0.1 g) was added into methanol (70% v/v, 1 mL) and extracted for 12 h at 4 • C, and then the homogenate was centrifuged at 10,000 r/min for 10 min at 4 • C. The supernatant was filtrated with a 0.22 µm durapore membrane for LC-MS/MS analysis.

MS/MS Analysis
The effluent from UPLC was analyzed using an AB SCIEX QTRAP 4500 and Triple Quad 4500 Systems (AB SCIEX, Boston, MA, USA) equipped with an ESI-Turbo Ion-Spray interface and operated in a positive ion mode. The operation parameters were as follows: ESI source temperature 550 • C, ion spray voltage 5500 V, curtain gas 25 psi, collisionactivated dissociation set 5 pis. Triple quadrupole scans were acquired as MRM experiments with optimized de-clustering potential and collision energy CE for each individual multiple reaction monitoring (MRM) transitions. The m/z range was set between 50 and 1000.

Metabolites Identification
Metabolites were identified using internal and public databases (MassBank, KNAp-SAcK, HMDB, MoTo DB and METLIN) and comparing m/z values, retention time, and the fragmentation patterns with the standards.

Differential Metabolites Analysis
The accumulation level of metabolites was ranked using a variable importance in projection (VIP) scores of orthogonal projection to latent structures-discriminant analysis (OPLS-DA). The level of differential accumulation between M1 and M2 was determined with a criterion of VIP ≥ 1 and t-test p ≤ 0.05.

cDNA Library Construction and Single Molecular Real-Time (SMRT) Sequencing
Total RNA was extracted using a Trizol reagent according to the manufacturer's protocol. The quality of extracted RNA was determined using a Agilent 2100 Bioanalyzer and agarose gel electrophoresis. mRNA was enriched by Oligo (dT) magnetic beads and transcribed into cDNA using a Clontech SMARTer cDNA Synthesis Kit. Then the cDNA was amplified by PCR for 13 cycles to prepare for the next SMRTbell library construction. The > 5 kb size sequence was ligated to the sequencing adapters. The SMRTbell template was applied and sequenced on a PacBio SequelII platform (Gene Denovo Biotechnology Co., Ltd., Guangzhou, China).

Isoform Data Processing
The raw sequencing reads of cDNA libraries were analyzed using a isoform sequencing (Iso-Seq) system [54]. Briefly, high quality CCS were extracted and then the FLNC reads were obtained after removing the primers, barcodes, poly (A) tail trimmings and concatemers. The FLNC reads were clustered to generate the entire isoform, which was used for sequences correction. Finally, isoforms were BLAST analyzed against the Nr database, isoforms were annotated against the databases including: KEGG, KOG and Swiss-Prot.

Transcriptomic Analysis and DEGs Identification
Total RNA was extracted using a Trizol reagent according to the manufacturer's protocol. The processes of enrichment by Oligo (dT) magnetic beads, fragmentation by ultrasonic, reverse transcription by a cDNA Synthesis Kit, synthesis of the second-strand cDNA by PCR amplification as well as purification of cDNA fragments by end-repairing and adapter-connecting were conducted according to previous protocols [55]. RNA-seq was performed by an Illumina HiSeqTM 4000 platform (Gene Denovo Biotechnology Co., Ltd., Guangzhou, China). Raw reads were filtered according to previous protocols (Li et al., 2008). Clean reads was assembled using Trinity [56]. The expression level of each transcript was normalized to RPKM values [57], and the differential expression level between M1 and M2 was determined with a criteria of |log 2 (fold-change)| ≥ 1 and p ≤ 0.05 by DESeq2 software and the edgeR package [58,59].

qRT-PCR Validation of Genes Involved in Flavonoid Biosynthesis
The primer sequence (Table 5) was designed via a primer-blast in NCBI and synthesized by reverse transcription (Sangon Biotech Co., Ltd., Shanghai, China). First cDNA was synthesized using a RT Kit. PCR amplification was performed using a SuperReal PreMix. Melting curve was analyzed at 72 • C for 34 s. Actin gene was used as a reference control. The RELs of gene were calculated using a 2 −∆∆Ct method [60].

Statistical Analysis
All experiments were performed in three biological replicates in this study. A t-test in SPSS 22.0 was performed for independent treatments with p < 0.05 as the basis for statistical differences.

Conclusions
From the above observations, the flavonoid and anthocyanin contents in the cultivar M1 of A. sinensis were greater than in M2, which rely on the up-regulation of genes involved in flavonoid and anthocyanin biosynthesis. The difference of stem color formation between M1 and M2 results from the anthocyanin differential accumulation as well as the genes' differential expression.

Informed Consent Statement: Not applicable.
Data Availability Statement: The datasets are publicly available at NCBI with Sequence Read Archive (SRA) accession: SRR16993328 to SRR16993332.

Conflicts of Interest:
All the authors declare that they have no conflicts of interest.